Search results for "cyclic reduction"
showing 8 items of 8 documents
A New Augmented Lagrangian Approach for $L^1$-mean Curvature Image Denoising
2015
Variational methods are commonly used to solve noise removal problems. In this paper, we present an augmented Lagrangian-based approach that uses a discrete form of the L1-norm of the mean curvature of the graph of the image as a regularizer, discretization being achieved via a finite element method. When a particular alternating direction method of multipliers is applied to the solution of the resulting saddle-point problem, this solution reduces to an iterative sequential solution of four subproblems. These subproblems are solved using Newton’s method, the conjugate gradient method, and a partial solution variant of the cyclic reduction method. The approach considered here differs from ex…
A parallel radix-4 block cyclic reduction algorithm
2014
A conventional block cyclic reduction algorithm operates by halving the size of the linear system at each reduction step, that is, the algorithm is a radix-2 method. An algorithm analogous to the block cyclic reduction known as the radix-q partial solution variant of the cyclic reduction (PSCR) method allows the use of higher radix numbers and is thus more suitable for parallel architectures as it requires fever reduction steps. This paper presents an alternative and more intuitive way of deriving a radix-4 block cyclic reduction method for systems with a coefficient matrix of the form tridiag{ − I,D, − I}. This is performed by modifying an existing radix-2 block cyclic reduction method. Th…
A parallel radix-4 block cyclic reduction algorithm
2013
SUMMARY A conventional block cyclic reduction algorithm operates by halving the size of the linear system at each reduction step, that is, the algorithm is a radix-2 method. An algorithm analogous to the block cyclic reduction known as the radix-q partial solution variant of the cyclic reduction (PSCR) method allows the use of higher radix numbers and is thus more suitable for parallel architectures as it requires fever reduction steps. This paper presents an alternative and more intuitive way of deriving a radix-4 block cyclic reduction method for systems with a coefficient matrix of the form tridiag{ − I,D, − I}. This is performed by modifying an existing radix-2 block cyclic reduction me…
On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method
2018
Partial solution variant of the cyclic reduction (PSCR) method is a direct solver that can be applied to certain types of separable block tridiagonal linear systems. Such linear systems arise, e.g., from the Poisson and the Helmholtz equations discretized with bilinear finite-elements. Furthermore, the separability of the linear system entails that the discretization domain has to be rectangular and the discretization mesh orthogonal. A generalized graphics processing unit (GPU) implementation of the PSCR method is presented. The numerical results indicate up to 24-fold speedups when compared to an equivalent CPU implementation that utilizes a single CPU core. Attained floating point perfor…
Poissonin yhtälön nopeat ratkaisijat
2016
Tutkielmassa esitellään Poissonin yhtälö sekä sen diskretointi. Lisäksi käydään läpi kaksi nopeaa numeerista menetelmää yhtälön ratkaisemiseksi. Yksinkertaisuuden vuoksi rajoitutaan kaksiulotteisiin tehtäviin, joissa on voimassa Dirichle’t reunaehto. Ensimmäinen menetelmistä on monihilamenetelmä, joka on iteratiivinen menetelmä, ja toisena syklinen reduktio, joka on suora menetelmä. Molemmat menetelmät ovat hyvin tehokkaita sekä helposti rinnakkaistuvia. In this thesis we introduce Poisson’s equation and its discretization. In addition we go through two fast numerical methods for solving the equation. The thesis is limited only to two-dimensional cases with Dirichlet boundary condition. The…
On GPU-accelerated fast direct solvers and their applications in image denoising
2015
Fast Poisson solvers for graphics processing units
2013
Two block cyclic reduction linear system solvers are considered and implemented using the OpenCL framework. The topics of interest include a simplified scalar cyclic reduction tridiagonal system solver and the impact of increasing the radix-number of the algorithm. Both implementations are tested for the Poisson problem in two and three dimensions, using a Nvidia GTX 580 series GPU and double precision floating-point arithmetic. The numerical results indicate up to 6-fold speed increase in the case of the two-dimensional problems and up to 3- fold speed increase in the case of the three-dimensional problems when compared to equivalent CPU implementations run on a Intel Core i7 quad-core CPU…
Numerical experiments with a parallel fast direct elliptic solver on Cray T3E
1997
A parallel fast direct O(N log N) solver is shortly described for linear systems with separable block tridiagonal matrices. A good parallel scalability of the proposed method is demonstrated on a Cray T3E parallel computer using MPI in communication. Also, the sequential performance is compared with the well-known BLKTRI-implementation of the generalized. cyclic reduction method using a single processor of Cray T3E.